An investigation of two multivariate permutation methods for controlling the false discovery proportion.

Authors

  • Edward L Korn
  • Ming-Chung Li
  • Lisa M McShane
  • Richard Simon
Abstract

Identifying genes that are differentially expressed between classes of samples is an important objective of many microarray experiments. Because of the thousands of genes typically considered, there is a tension between identifying as many of the truly differentially expressed genes as possible, but not too many genes that are not really differentially expressed (false discoveries). Controlling the proportion of identified genes that are false discoveries, the false discovery proportion (FDP), is a goal of interest. In this paper, two multivariate permutation methods are investigated for controlling the FDP. One is based on a multivariate permutation testing (MPT) method that probabilistically controls the number of false discoveries, and the other is based on the Significance Analysis of Microarrays (SAM) procedure that provides an estimate of the FDP. Both methods account for the correlations among the genes. We find the ability of the methods to control the proportion of false discoveries varies substantially depending on the implementation characteristics. For example, for both methods one can proceed from the most significant gene to the least significant gene until the estimated FDP is just above the targeted level ('top-down' approach), or from the least significant gene to the most significant gene until the estimated FDP is just below the targeted level ('bottom-up' approach). We find that the top-down MPT-based method probabilistically controls the FDP, whereas our implementation of the top-down SAM-based method does not. Bottom-up MPT-based or SAM-based methods can result in poor control of the FDP.
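
The difference between the 'top-down' and 'bottom-up' stopping rules described in the abstract can be illustrated with a minimal sketch. It is not the authors' MPT or SAM implementation: it assumes the estimated FDP for each candidate gene list has already been computed by some method, and the function names and the numbers in `est` are hypothetical. Here `est_fdp[k-1]` is the estimated FDP for the list made up of the k most significant genes.

```python
from typing import Sequence


def top_down_cutoff(est_fdp: Sequence[float], target: float) -> int:
    """Grow the list from the most significant gene and stop at the
    first list whose estimated FDP exceeds the target.
    Returns the number of genes kept."""
    k = 0
    for value in est_fdp:
        if value > target:
            break
        k += 1
    return k


def bottom_up_cutoff(est_fdp: Sequence[float], target: float) -> int:
    """Shrink the list from the least significant gene and stop at the
    first list whose estimated FDP is at or below the target.
    Returns the number of genes kept."""
    for k in range(len(est_fdp), 0, -1):
        if est_fdp[k - 1] <= target:
            return k
    return 0


# Hypothetical, non-monotone estimates for lists of size 1..6.
est = [0.00, 0.05, 0.20, 0.15, 0.08, 0.30]
target = 0.10

print(top_down_cutoff(est, target))   # genes kept by the top-down rule
print(bottom_up_cutoff(est, target))  # genes kept by the bottom-up rule
```

When the estimates are not monotone in the list size, the bottom-up rule can return a longer list than the top-down rule, because it accepts a list even though shorter lists along the way had estimates above the target.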

Similar resources

The False Discovery Rate in Simultaneous Fisher and Adjusted Permutation Hypothesis Testing on Microarray Data

Background and Objectives: In recent years, new technologies have led to the production of large amounts of data, and in the field of biology microarray technology has also developed dramatically. Meanwhile, the Fisher test is used to compare the control group with two or more experimental groups and also to detect the differentially expressed genes. In this study, the false discovery rate was investiga...

On correcting the overestimation of the permutation-based false discovery rate estimator

MOTIVATION Recent attempts to account for multiple testing in the analysis of microarray data have focused on controlling the false discovery rate (FDR), which is defined as the expected percentage of the number of false positive genes among the claimed significant genes. As a consequence, the accuracy of the FDR estimators will be important for correctly controlling FDR. Xie et al. found that ...

A Tight Prediction Interval for False Discovery Proportion under Dependence

The false discovery proportion (FDP) is a useful measure of abundance of false positives when a large number of hypotheses are being tested simultaneously. Methods for controlling the expected value of the FDP, namely the false discovery rate (FDR), have become widely used. It is highly desired to have an accurate prediction interval for the FDP in such applications. Some degree of dependence a...

Package 'multtest': Resampling-based Multiple Hypothesis Testing

Description: Non-parametric bootstrap and permutation resampling-based multiple testing procedures (including empirical Bayes methods) for controlling the family-wise error rate (FWER), generalized family-wise error rate (gFWER), tail probability of the proportion of false positives (TPPFP), and false discovery rate (FDR). Several choices of bootstrap-based null distribution are implemented (cen...

Journal title:
  • Statistics in Medicine

Volume 26, Issue 24

Pages: -

Publication date: 2007